AITopics | corner weight

Collaborating Authors

corner weight

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

Alegre, Lucas N., Bazzan, Ana L. C., Roijers, Diederik M., Nowé, Ann, da Silva, Bruno C.

arXiv.org Artificial IntelligenceMar-23-2023

Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences over (possibly conflicting) reward functions. Such algorithms often learn a set of policies (each optimized for a particular agent preference) that can later be used to solve problems with novel preferences. We introduce a novel algorithm that uses Generalized Policy Improvement (GPI) to define principled, formally-derived prioritization schemes that improve sample-efficient learning. They implement active-learning strategies by which the agent can (i) identify the most promising preferences/objectives to train on at each moment, to more rapidly solve a given MORL problem; and (ii) identify which previous experiences are most relevant when learning a policy for a particular agent preference, via a novel Dyna-style MORL method. We prove our algorithm is guaranteed to always converge to an optimal solution in a finite number of steps, or an $\epsilon$-optimal solution (for a bounded $\epsilon$) if the agent is limited and can only identify possibly sub-optimal policies. We also prove that our method monotonically improves the quality of its partial solutions while learning. Finally, we introduce a bound that characterizes the maximum utility loss (with respect to the optimal solution) incurred by the partial solutions computed by our method throughout learning. We empirically show that our method outperforms state-of-the-art MORL algorithms in challenging multi-objective tasks, both with discrete and continuous state and action spaces.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.5555/3545946.3598872

2301.07784

Country:

Asia > Singapore (0.04)
Oceania > New Zealand (0.04)
North America > Puerto Rico (0.04)
(18 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Efficient Methods for Multi-Objective Decision-Theoretic Planning

Roijers, Diederik Marijn (University of Amsterdam)

AAAI ConferencesJul-15-2015

In decision-theoretic planning problems, such as (partially observable) Markov decision problems or coordination graphs, agents typically aim to optimize a scalar value function. However, in many real-world problems agents are faced with multiple possibly conflicting objectives. In such multi-objective problems, the value is a vector rather than a scalar, and we need methods that compute a coverage set, i.e., a set of solutions optimal for all possible trade-offs between the objectives. In this project propose new multi-objective planning methods that compute the so-called convex coverage set (CCS): the coverage set for when policies can be stochastic, or the preferences are linear. We show that the CCS has favorable mathematical properties, and is typically much easier to compute that the Pareto front, which is often axiomatically assumed as the solution set for multi-objective decision problems.

Add feedback

Point-Based Planning for Multi-Objective POMDPs

Roijers, Diederik Marijn (University of Amsterdam) | Whiteson, Shimon (University of Amsterdam) | Oliehoek, Frans A. (University of Liverpool)

AAAI ConferencesJul-15-2015

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.

corner weight, solvescalarizedpomdp, vector, (14 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Computing Convex Coverage Sets for Faster Multi-objective Coordination

Roijers, Diederik Marijn, Whiteson, Shimon, Oliehoek, Frans A.

Journal of Artificial Intelligence ResearchMar-31-2015

In this article, we propose new algorithms for multi-objective coordination graphs (MO-CoGs). Key to the efficiency of these algorithms is that they compute a convex coverage set (CCS) instead of a Pareto coverage set (PCS). Not only is a CCS a sufficient solution set for a large class of problems, it also has important characteristics that facilitate more efficient solutions. We propose two main algorithms for computing a CCS in MO-CoGs. Convex multi-objective variable elimination (CMOVE) computes a CCS by performing a series of agent eliminations, which can be seen as solving a series of local multi-objective subproblems. Variable elimination linear support (VELS) iteratively identifies the single weight vector, w, that can lead to the maximal possible improvement on a partial CCS and calls variable elimination to solve a scalarized instance of the problem for w. VELS is faster than CMOVE for small and medium numbers of objectives and can compute an ε-approximate CCS in a fraction of the runtime. In addition, we propose variants of these methods that employ AND/OR tree search instead of variable elimination to achieve memory efficiency. We analyze the runtime and space complexities of these methods, prove their correctness, and compare them empirically against a naive baseline and an existing PCS method, both in terms of memory-usage and runtime. Our results show that, by focusing on the CCS, these methods achieve much better scalability in the number of agents than the current state of the art.

agent, ccs, vector, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4550

AI Access Foundation

10933

Journal of Artificial Intelligence Research

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Orange County > Irvine (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.36)

Add feedback